Abstract

The longest stretch $L(n)$ of consecutive heads in $n$ i.i.d. coin tosses is seen from the prism of
large deviations. We first establish precise asymptotics for the moment generating function of
$L(n)$ and then show that there are precisely two large deviation principles, one concerning the
behavior of the distribution of $L(n)$ near its nominal value and one away from it. We
discuss applications to inference and to logarithmic asymptotics of functionals of $L(n)$.

1 Introduction

The earliest reference to the longest stretch of consecutive successes in “random” trials is (as we
learn in the 1981 English translation [14, p. 138] of the 1928 book of von Mises) in a 1916 paper
of the German philosopher Karl Marbe and concerns the longest stretch of consecutive births of
children of the same sex as appearing in the birth register of a Bavarian town. (This was actually
used by parents to “predict” the sex of their child.) The longest stretch of same-sex births in 200
thousand birth registrations was actually $17 \approx \log_2(200 \times 10^3)$. Von Mises [13] was apparently the
first one to study the problem rigorously and his result can be seen in Feller’s Volume 1 [6, Section
XIII.12].

If $X_1, X_2, \ldots$ are i.i.d. Bernoulli trials, $\mathbb{P}(X_j = 1) = p$, $\mathbb{P}(X_j = 0) = q := 1 - p$, and if $L(n)$ is
the largest $\ell$ such that $X_{i+1} = \cdots = X_{i+\ell} = 1$ for some $0 \le i \le n - \ell$, then we call the base-$1/p$
logarithm $\log_{1/p} n$ of $n$ the nominal value of $L(n)$ because, as Erdős and Rényi [4] show (in a more
general setup in fact; see also [5] and [12]),

$$\lim_{n\to\infty} \frac{L(n)}{\log_{1/p} n} = 1, \quad \text{a.s.} \tag{1}$$
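The definition of $L(n)$ and the limit (1) can be illustrated by a small simulation. The sketch below is ours, not part of the paper: the function names, the seed and the sample sizes are illustrative assumptions, and since the convergence in (1) is on a logarithmic scale the printed ratios only drift slowly towards 1.

```python
import math
import random


def longest_head_run(tosses):
    """Length of the longest block of consecutive 1s (heads) in `tosses`."""
    best = current = 0
    for x in tosses:
        current = current + 1 if x == 1 else 0
        best = max(best, current)
    return best


def simulate_ratio(n, p, rng):
    """Draw n Bernoulli(p) tosses and return L(n) / log_{1/p} n."""
    tosses = [1 if rng.random() < p else 0 for _ in range(n)]
    nominal = math.log(n) / math.log(1.0 / p)
    return longest_head_run(tosses) / nominal


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen only for reproducibility
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, round(simulate_ratio(n, 0.5, rng), 3))
```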
The distribution of $L(n)$ is not explicit. Yet, there are many estimates. The literature is littered
with them and one of us recently contributed to it in [8] (where other quantities, such as the number
of times that longest or shortest runs occur, are also explored).

Our principal interest in this paper is to see to what extent large deviations theory can be applied
to the problem of squeezing something useful about the distribution of $L(n)$. We first establish
logarithmic asymptotics for the moment generating function $\mathbb{E}\exp\{\lambda L(n)\}$ as $n \to \infty$. The asymptotics
split in three parts: the subcritical regime, $\lambda < \ln(1/p)$, the supercritical regime, $\lambda > \ln(1/p)$,
and the critical one when $\lambda = \ln(1/p)$. These asymptotics can be used in combination with the
Gärtner-Ellis theorem (but see Remark 2 below) to derive a full large deviations principle (LDP).
There are precisely two LDPs: one concerning the behavior of the distribution of $L(n)$ near its
nominal value $\log_{1/p} n$ and another far away from it.

We outline the results below. Our starting point is asymptotics for the moment generating
function and this is what we do right away. Note that we use $\ln$ for natural logarithm and $\log_b$ for
logarithm with base $b$. The symbol $a_n \sim b_n$ means $a_n/b_n \to 1$ as $n \to \infty$. Note also that we use
the term “Laplace transform” interchangeably with the term “moment generating function”. (The
variable $\lambda$ ranges over the whole real line.)

Theorem 1. The moment generating function of $L(n)$ has the following asymptotics:

(i) Subcritical regime: for $\lambda < \ln(1/p)$,

$$\ln \mathbb{E}\exp\{\lambda L(n)\} \sim \lambda \log_{1/p} n;$$

(ii) Critical regime: for $\lambda = \ln(1/p)$,

$$\ln \mathbb{E}\exp\{\lambda L(n)\} \sim 2\lambda \log_{1/p} n;$$

(iii) Supercritical regime: for $\lambda > \ln(1/p)$,

$$\ln \mathbb{E}\exp\{\lambda L(n)\} \sim (\lambda - \ln(1/p))\, n.$$
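For moderate $n$ the three regimes can be explored numerically with an exact computation. The sketch below is our own illustration and is not the method used in the paper: it computes $\mathbb{P}(L(n) \ge k)$ by a dynamic program over the length of the current trailing run, and then uses the summation-by-parts identity $\mathbb{E} e^{\lambda L(n)} = 1 + \sum_{k=1}^{n} (e^{\lambda k} - e^{\lambda(k-1)})\, \mathbb{P}(L(n) \ge k)$. The function names and the parameter values in the demo are assumptions; convergence in the subcritical and critical regimes is slow, so the printed ratios should be read as rough indications only.

```python
import math


def tail_longest_run(n, k, p):
    """Exact P(L(n) >= k), k >= 1: a head run of length k occurs within n tosses."""
    q = 1.0 - p
    state = [0.0] * k   # state[r]: no run of length k yet, trailing run has length r
    state[0] = 1.0
    hit = 0.0           # probability that a run of length k has already occurred
    for _ in range(n):
        hit += p * state[k - 1]          # a head completes a run of length k
        new = [0.0] * k
        new[0] = q * sum(state)          # a tail resets the trailing run
        for r in range(k - 1):
            new[r + 1] = p * state[r]    # a head extends the trailing run
        state = new
    return hit


def log_mgf(n, lam, p):
    """Exact ln E exp{lam * L(n)} via summation by parts over the tail probabilities."""
    total = 1.0
    for k in range(1, n + 1):
        weight = math.exp(lam * k) - math.exp(lam * (k - 1))
        total += weight * tail_longest_run(n, k, p)
    return math.log(total)


if __name__ == "__main__":
    p = 0.5
    crit = math.log(1.0 / p)
    for n in (50, 100, 200):
        nominal = math.log(n) / crit  # log_{1/p} n
        print(
            n,
            round(log_mgf(n, 0.3, p) / nominal, 3),   # compare with lam = 0.3
            round(log_mgf(n, crit, p) / nominal, 3),  # compare with 2 * ln(1/p)
            round(log_mgf(n, 1.0, p) / n, 3),         # compare with 1.0 - ln(1/p)
        )
```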

To the best of the authors’ knowledge, the asymptotics on the moment generating function in
Theorem 1 have not explicitly appeared in the literature. To show Theorem 1 there are several
options. One option is the use of the recursion formula

$$\mathbb{E}\exp\{\lambda L(n)\} = q \sum_{j=0}^{n-1} p^{j}\, \mathbb{E}\exp\big(\lambda \max\{L(n-j-1),\, j\}\big)$$

appearing in [8]. Another possible option is to use Fibonacci-type polynomials, as appearing in the
combinatorially-derived expressions for the moment generating function in [11]. But the simplest
method is a good estimate for the distribution of $L(n)$; see Lemma 2. Why this lemma works to
establish the asymptotics in the subcritical and critical regimes is the subject of Section 2 (Lemmas
3 and 4).

One implication of Theorem 1 is that it immediately suggests the form of large deviations of
$L(n)$. In [7], a large deviations type probability was established in the following form

$$\lim_{n\to\infty} \frac{1}{n} \ln \mathbb{P}(L(n) < k) = -\beta, \tag{2}$$

for a fixed $k$, where $\beta$ is a positive constant. Since, however, $\log_{1/p} n$ is the nominal value of $L(n)$,
in the sense that (1) holds, the limit (2) is not strictly speaking a result in the theory of large
deviations since it is not about the deviation from the most probable point $\log_{1/p} n$ of the random
variables $L(n)$. A partial answer was recently included in [9], where it was proved that

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} > 1 + x\right) = -x \ln(1/p), \quad x > 0. \tag{3}$$

Although research on head runs is a classical topic with many applications (see for instance
[1]), no explicit general large deviations principles can be found in the literature.

The subcritical asymptotics of Theorem 1 corresponds to the convergence $L(n)/\log_{1/p} n \to 1$
almost surely as $n \to \infty$. Therefore we can study the large deviations of $L(n)/\log_{1/p} n$. Let us first
define the function $\Lambda^*(x)$ as

$$\Lambda^*(x) = \begin{cases} +\infty, & x < 1, \\ (x - 1)\ln(1/p), & x \ge 1. \end{cases} \tag{4}$$

Notice that $\Lambda^*$ is lower semicontinuous with $\{x \in \mathbb{R} : \Lambda^*(x) \le c\}$ compact for all $c \ge 0$. This means
that $\Lambda^*$ is a good rate function (in the terminology of [3]). Our references to large deviations theory
are Dembo and Zeitouni [3] and Wentzell [17]. The following full LDP is obtained as a corollary to
Theorem 1.


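Corollary 1 (LDP near the nominal value). The normalized longest head run $L(n)/\log_{1/p} n$ satisfies
a large deviation principle with a good rate function $\Lambda^*(x)$ given by (4) and speed $\log_{1/p} n$.
Namely,

(i) for any open set $O \subset \mathbb{R}$,

$$\liminf_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} \in O\right) \ge -\inf_{x \in O} \Lambda^*(x); \tag{5}$$

(ii) for any closed set $F \subset \mathbb{R}$,

$$\limsup_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} \in F\right) \le -\inf_{x \in F} \Lambda^*(x). \tag{6}$$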


Remark 1. Evidently, the large deviation principle presented in Corollary 1 generalizes the result
(3) in [9], which comes from choosing the open set $O = (1 + x, \infty)$ and the closed set $F = [1 + x, \infty)$.

Remark 2. [Connections with the Gärtner-Ellis theorem] The proof of the large deviation upper
bound (6) comes directly from the Gärtner-Ellis theorem (cf. [3]). We note that the rate function
$\Lambda^*$ is the Fenchel-Legendre transform of the following function

$$\Lambda(\lambda) = \begin{cases} +\infty, & \lambda > \ln(1/p), \\ 2\lambda, & \lambda = \ln(1/p), \\ \lambda, & \lambda < \ln(1/p), \end{cases}$$

that is, $\Lambda^*(x) = \sup_{\lambda \in \mathbb{R}} [\lambda x - \Lambda(\lambda)]$. There is a slight catch here; to establish the lower bound, the
Gärtner-Ellis theorem requires that the function $\Lambda$ be essentially smooth, namely, that $\lim_{k\to\infty} |\Lambda'(\lambda_k)| = \infty$
as $\lambda_k \to \ln(1/p)$. But this is not true here. Therefore the Gärtner-Ellis theorem does not cover
our case. If instead we look at the lower bound proposed in the Gärtner-Ellis theorem, then
we have for any open set $O$,

$$\liminf_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} \in O\right) \ge -\inf_{x \in O \cap H} \Lambda^*(x),$$

where $H$ is the so-called set of exposed points [3, Page 44] of $\Lambda^*$. In our case, it is easy to see that
the set $H$ consists of only one point, $H = \{1\}$. So the proposed lower bound from the Gärtner-Ellis
theorem becomes trivial since

$$\inf_{x \in O \cap H} \Lambda^*(x) = \Lambda^*(1) = 0.$$

In summary, the large deviation principle in Corollary 1 gives a non-trivial example which the
Gärtner-Ellis theorem does not cover.


The supercritical regime of Theorem 1 gives another large deviation result with a good rate
function $\Lambda^*(x)$ defined by

$$\Lambda^*(x) = \begin{cases} +\infty, & x < 0, \\ x \ln(1/p), & 0 \le x \le 1, \\ +\infty, & x > 1. \end{cases} \tag{7}$$


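Corollary 2 (LDP away from the nominal value). The normalized longest head run $L(n)/n$ satisfies
a large deviations principle with a good rate function $\Lambda^*(x)$ given by (7) and speed $n$. Namely,

(i) for any open set $O \subset \mathbb{R}$,

$$\liminf_{n\to\infty} \frac{1}{n} \ln \mathbb{P}\left(\frac{L(n)}{n} \in O\right) \ge -\inf_{x \in O} \Lambda^*(x);$$

(ii) for any closed set $F \subset \mathbb{R}$,

$$\limsup_{n\to\infty} \frac{1}{n} \ln \mathbb{P}\left(\frac{L(n)}{n} \in F\right) \le -\inf_{x \in F} \Lambda^*(x).$$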


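Another implication of Theorem 1 and its Corollaries 1 and 2 is in obtaining asymptotics for
other functionals of $L(n)$. We summarize the results as follows.

Corollary 3. (I) If $f : \mathbb{R}_+ \to \mathbb{R}$ is continuous and satisfies one of the two conditions

$$\lim_{m\to\infty} \limsup_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\left[\exp\left\{\log_{1/p} n \cdot f\left(\frac{L(n)}{\log_{1/p} n}\right)\right\} \mathbf{1}\left\{f\left(\frac{L(n)}{\log_{1/p} n}\right) \ge m\right\}\right] = -\infty, \tag{A.1}$$

$$\limsup_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\left\{\log_{1/p} n \cdot \gamma \cdot f\left(\frac{L(n)}{\log_{1/p} n}\right)\right\} < \infty, \quad \text{for some } \gamma > 1, \tag{A.2}$$

then it holds that

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\left\{\log_{1/p} n \cdot f\left(\frac{L(n)}{\log_{1/p} n}\right)\right\} = \max_{x \in \mathbb{R}} [f(x) - \Lambda^*(x)],$$

where $\Lambda^*$ is the rate function in (4).

(II) If $g : \mathbb{R}_+ \to \mathbb{R}$ is continuous and satisfies one of the two conditions

$$\lim_{m\to\infty} \limsup_{n\to\infty} \frac{1}{n} \ln \mathbb{E}\left[\exp\left\{n \cdot g\left(\frac{L(n)}{n}\right)\right\} \mathbf{1}\left\{g\left(\frac{L(n)}{n}\right) \ge m\right\}\right] = -\infty, \tag{B.1}$$

$$\limsup_{n\to\infty} \frac{1}{n} \ln \mathbb{E}\exp\left\{n \cdot \gamma \cdot g\left(\frac{L(n)}{n}\right)\right\} < \infty, \quad \text{for some } \gamma > 1, \tag{B.2}$$

then it holds that

$$\lim_{n\to\infty} \frac{1}{n} \ln \mathbb{E}\exp\left\{n \cdot g\left(\frac{L(n)}{n}\right)\right\} = \max_{x \in \mathbb{R}} [g(x) - \Lambda^*(x)],$$

where $\Lambda^*$ is the rate function in (7).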

Here we list several functions $f$ and $g$ for which the conclusions of Corollary 3 hold. The
verification is included in Section 4.

• $f(x)$ and $g(x)$ are continuous and bounded. In this case, (A.1), (A.2), (B.1) and (B.2) hold.

• $f(x) = c x^{\alpha}$, $x \in \mathbb{R}_+$, where $c > 0$ and $0 < \alpha < 1$. It is proved in Section 4 that (A.1) holds.

• $g(x)$ satisfies the condition: there is $m > 0$ such that if $|g(x)| \ge m$, then $x > 1$. For instance,
with $c_1, c_2, c_3, c_4, a$ positive constants, the functions

$$c_1 x^{a}, \qquad c_2 e^{c_3 x}, \qquad c_4 \ln(x + a)$$

satisfy this condition. Condition (B.1) is fulfilled for this type of function since $L(n)/n \le 1$.
Some easy conclusions of Theorem 1 concern well-known asymptotics for the moments of $L(n)$.
Formally taking derivatives at $\lambda = 0$ of the expression in the subcritical regime gives

$$\mathbb{E} L(n)^k \sim (\log_{1/p} n)^k, \quad k \in \mathbb{N}.$$

The asymptotic expressions of the first two moments can be found in [16], and the higher order
moments are discussed in [15, page 63]. For convenience, we include the asymptotic mean as follows

$$\mathbb{E} L(n) = \log_{1/p} n + \log_{1/p}(1 - p) + \log_{1/p}(e^{\gamma}) - \frac{1}{2} + \varepsilon(n), \tag{8}$$

where $\gamma = 0.5772\ldots$ is Euler’s constant, and $\varepsilon(n)$ is “small”.

The rest of the paper is organized as follows. In Section 2 we prove Theorem 1, along with
some auxiliary results. In Section 3 we prove the large deviation principles stated in Corollaries
1 and 2. Some other asymptotics related to Corollary 3 are given in Section 4. We discuss an application
to inference in Section 5, and some open problems in Section 6. To save some space we use the
abbreviation

$$\ell(n) := \log_{1/p} n$$

whenever convenient. As usual, we let $\lfloor x \rfloor$ be the largest integer $n$ such that $n \le x$ and $\lceil x \rceil$
be the smallest integer $n$ such that $n \ge x$.

2 Laplace transform asymptotics

We obtain logarithmic asymptotics for $\mathbb{E}\exp\{\lambda L(n)\}$, for all $\lambda \in \mathbb{R}$, in several steps. First, we
obtain a lower bound valid for all $\lambda \in \mathbb{R}$. Then we obtain an upper bound for the subcritical
case ($\lambda < \ln(1/p)$). These two bounds combined give the exact logarithmic asymptotics for the
subcritical case. The limit in the critical case ($\lambda = \ln(1/p)$) requires special care and is treated
separately. Finally, we obtain asymptotics for the supercritical case ($\lambda > \ln(1/p)$).

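Lemma 1. It holds that

$$\liminf_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\{\lambda L(n)\} \ge \lambda,$$

for all $\lambda \in \mathbb{R}$.

Proof. The case $\lambda = 0$ is trivial. Assume $\lambda > 0$. Then, for $0 < \varepsilon < 1$,

$$\mathbb{E}\exp\{\lambda L(n)\} \ge \mathbb{E}[\exp\{\lambda L(n)\};\ L(n) \ge (1 - \varepsilon)\log_{1/p} n] \ge \exp\{\lambda(1 - \varepsilon)\log_{1/p} n\}\, \mathbb{P}(L(n) \ge (1 - \varepsilon)\log_{1/p} n).$$

Hence

$$\frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\{\lambda L(n)\} \ge \lambda(1 - \varepsilon) + \frac{1}{\log_{1/p} n} \ln \mathbb{P}(L(n) \ge (1 - \varepsilon)\log_{1/p} n).$$

Since $\mathbb{P}(L(n) \ge (1 - \varepsilon)\log_{1/p} n) \to 1$,

$$\liminf_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\{\lambda L(n)\} \ge \lambda(1 - \varepsilon),$$

and letting $\varepsilon \downarrow 0$ we obtain the result. When $\lambda < 0$, we use

$$\mathbb{E}\exp\{\lambda L(n)\} \ge \mathbb{E}[\exp\{\lambda L(n)\};\ L(n) \le (1 + \varepsilon)\log_{1/p} n]$$

and proceed similarly. □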


The following bound for the distribution of $L(n)$ is known in the literature, but we give a simple
proof below for completeness.

Lemma 2. For all $k, n \in \mathbb{N}$, $1 \le k \le n$,

$$(1 - p^k)^{n-k+1} \le \mathbb{P}(L(n) < k) \le (1 - q p^k)^{n-k+1}.$$

Proof. Let $X_1, X_2, \ldots$ be i.i.d. with $\mathbb{P}(X_i = 1) = p$, $\mathbb{P}(X_i = 0) = q$. Let $S_i = X_1 + \cdots + X_i$, $i \ge 1$, and $S_0 = 0$.
Notice that $L(n) < k$ if and only if $S_m - S_{m-k} < k$ for all $k \le m \le n$. By a standard correlation
inequality,

$$\mathbb{P}(L(n) < k) = \mathbb{P}\left(\bigcap_{m=k}^{n} \{S_m - S_{m-k} < k\}\right) \ge \prod_{m=k}^{n} \mathbb{P}(S_m - S_{m-k} < k) = \prod_{m=k}^{n} (1 - p^k) = (1 - p^k)^{n-k+1},$$

and this is the lower bound. For the upper bound, since, trivially, $L(k - 1) < k$, we have

$$\mathbb{P}(L(n) < k) = \prod_{m=k}^{n} \frac{\mathbb{P}(L(m) < k)}{\mathbb{P}(L(m - 1) < k)}.$$

But, since, trivially again, $L(m) \ge L(m - 1)$ for all $m$,

$$\mathbb{P}(L(m - 1) < k) = \mathbb{P}(L(m) < k) + \mathbb{P}(L(m - 1) < k \le L(m)),$$

and observe that

$$\mathbb{P}(L(m - 1) < k \le L(m)) = \mathbb{P}(L(m - k - 1) < k,\ X_{m-k} = 0,\ X_{m-k+1} = \cdots = X_m = 1) = \mathbb{P}(L(m - k - 1) < k)\, q p^k \ge \mathbb{P}(L(m - 1) < k)\, q p^k.$$

Substituting this into the previous display gives $\mathbb{P}(L(m) < k) \le (1 - q p^k)\, \mathbb{P}(L(m - 1) < k)$, which
implies that $\mathbb{P}(L(n) < k) \le \prod_{m=k}^{n} (1 - q p^k) = (1 - q p^k)^{n-k+1}$, as claimed. □
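The two-sided bound $(1 - p^k)^{n-k+1} \le \mathbb{P}(L(n) < k) \le (1 - q p^k)^{n-k+1}$ used throughout this section can be checked by brute force for small $n$. The following is only a sanity check that we add for illustration; the function names are ours, and the enumeration over all $2^n$ sequences is feasible only for small $n$.

```python
from itertools import product


def longest_head_run(tosses):
    """Length of the longest block of consecutive 1s in `tosses`."""
    best = current = 0
    for x in tosses:
        current = current + 1 if x == 1 else 0
        best = max(best, current)
    return best


def prob_longest_run_below(n, k, p):
    """P(L(n) < k), summing the probability of every 0/1 sequence of length n."""
    q = 1.0 - p
    total = 0.0
    for seq in product((0, 1), repeat=n):
        if longest_head_run(seq) < k:
            heads = sum(seq)
            total += p**heads * q ** (n - heads)
    return total


def lemma2_bounds(n, k, p):
    """Lower and upper bounds (1 - p^k)^(n-k+1) and (1 - q p^k)^(n-k+1)."""
    q = 1.0 - p
    m = n - k + 1
    return (1.0 - p**k) ** m, (1.0 - q * p**k) ** m


if __name__ == "__main__":
    n, p = 10, 0.5
    for k in range(1, n + 1):
        lower, upper = lemma2_bounds(n, k, p)
        exact = prob_longest_run_below(n, k, p)
        print(k, round(lower, 4), round(exact, 4), round(upper, 4))
```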

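We next obtain an upper bound in the subcritical regime. Remember that $\ell(n) := \log_{1/p} n$.

Lemma 3. It holds that

$$\limsup_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\{\lambda L(n)\} \le \lambda,$$

for $-\infty < \lambda < \ln(1/p)$.

Proof. Suppose first that $0 < \lambda < \ln(1/p)$, pick $\varepsilon > 0$, and write

$$\mathbb{E} e^{\lambda L(n)} = \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{\ell(n)} - 1 \le \varepsilon\right) + \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{\ell(n)} - 1 > \varepsilon\right) =: A_+(n) + B_+(n). \tag{9}$$

The first term is estimated as

$$A_+(n) \le e^{\lambda(1+\varepsilon)\ell(n)}\, \mathbb{P}\left(\frac{L(n)}{\ell(n)} - 1 \le \varepsilon\right), \tag{10}$$

and so

$$\frac{\ln A_+(n)}{\ell(n)} \le \lambda(1 + \varepsilon) + o(1),$$

implying that

$$\limsup_{n\to\infty} \frac{\ln A_+(n)}{\ell(n)} \le \lambda(1 + \varepsilon).$$

For the second term we write

$$B_+(n) = \sum_{k=1}^{\infty} \mathbb{E}\left(e^{\lambda L(n)};\ 1 + k\varepsilon < \frac{L(n)}{\ell(n)} \le 1 + (k+1)\varepsilon\right) \le \sum_{k=1}^{\infty} e^{\lambda(1+(k+1)\varepsilon)\ell(n)}\, \mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + k\varepsilon\right).$$

Observe now, from Lemma 2, that

$$\mathbb{P}(L(n) \ge k) = 1 - \mathbb{P}(L(n) < k) \le 1 - (1 - p^k)^{n-k+1} \le (n - k + 1)p^k \le n p^k,$$

for all $0 \le k \le n$, and, trivially, for all $k > n$ also. This implies that

$$\mathbb{P}(L(n) > t) \le n p^t, \quad t > 0,$$

and so

$$\mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + k\varepsilon\right) \le n p^{(1+k\varepsilon)\ell(n)} = n \cdot n^{-(1+k\varepsilon)} = n^{-k\varepsilon}.$$

Therefore,

$$B_+(n) \le \sum_{k=1}^{\infty} e^{\lambda(1+(k+1)\varepsilon)\ell(n)}\, n^{-k\varepsilon} = e^{\lambda(1+\varepsilon)\ell(n)} \sum_{k=1}^{\infty} \left(n^{-(1 - \lambda/\ln(1/p))}\right)^{k\varepsilon},$$

where the last sum tends to 0 as $n \to \infty$ because $\lambda < \ln(1/p)$, whence

$$\limsup_{n\to\infty} \frac{\ln B_+(n)}{\ell(n)} \le \lambda(1 + \varepsilon).$$

Since

$$\limsup_{n\to\infty} \frac{\ln \mathbb{E} e^{\lambda L(n)}}{\ell(n)} = \max\left\{\limsup_{n\to\infty} \frac{\ln A_+(n)}{\ell(n)},\ \limsup_{n\to\infty} \frac{\ln B_+(n)}{\ell(n)}\right\}$$

and $\varepsilon > 0$ is arbitrary, the result follows.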

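Suppose next that $\lambda < 0$. For $0 < \varepsilon < 1$, write

$$\mathbb{E} e^{\lambda L(n)} = \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{\ell(n)} - 1 \ge -\varepsilon\right) + \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{\ell(n)} - 1 < -\varepsilon\right) =: A_-(n) + B_-(n).$$

For the first term we have

$$A_-(n) \le e^{\lambda(1-\varepsilon)\ell(n)}\, \mathbb{P}\left(\frac{L(n)}{\ell(n)} - 1 \ge -\varepsilon\right),$$

implying that

$$\limsup_{n\to\infty} \frac{\ln A_-(n)}{\ell(n)} \le \lambda(1 - \varepsilon).$$

As for the second term,

$$B_-(n) = \sum_{k=1}^{\lfloor 1/\varepsilon \rfloor - 1} \mathbb{E}\left(e^{\lambda L(n)};\ 1 - (k+1)\varepsilon \le \frac{L(n)}{\ell(n)} < 1 - k\varepsilon\right) \le \sum_{k=1}^{\lfloor 1/\varepsilon \rfloor - 1} e^{\lambda(1-(k+1)\varepsilon)\ell(n)}\, \mathbb{P}\left(\frac{L(n)}{\ell(n)} < 1 - k\varepsilon\right).$$

Since there are only finitely many terms in the sum, we can simply write

$$\limsup_{n\to\infty} \frac{\ln B_-(n)}{\ell(n)} \le \max_{1 \le k \le \lfloor 1/\varepsilon \rfloor - 1} \left\{\lambda(1 - (k+1)\varepsilon) + \limsup_{n\to\infty} \frac{1}{\ell(n)} \ln \mathbb{P}\left(\frac{L(n)}{\ell(n)} < 1 - k\varepsilon\right)\right\} \le \max_{1 \le k \le \lfloor 1/\varepsilon \rfloor - 1} \{\lambda(1 - (k+1)\varepsilon) - \infty\} = -\infty,$$

where $-\infty$ appears because of Lemma 7 below. Since $\varepsilon$ is arbitrary, we again conclude that
$\limsup_{n\to\infty} \ell(n)^{-1} \ln \mathbb{E} e^{\lambda L(n)} \le \lambda$. □
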
The critical case is treated next. 


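Lemma 4. When $\lambda = \ln(1/p)$, it holds that

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{E}\exp\{\lambda L(n)\} = 2\lambda.$$

Proof. Fix sufficiently small $\varepsilon > 0$. Using the probability estimates of Lemma 2 we obtain that
there exist positive constants $c_1, c_2$ such that

$$c_2\, n^{-(1+k\varepsilon)}\, (n + 1 - (1 + k\varepsilon)\ell(n)) \le \mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + k\varepsilon\right) \le c_1\, n^{-(1+k\varepsilon)}\, (n + 1 - (1 + k\varepsilon)\ell(n))$$

uniformly over all $k$ such that

$$1 \le k \le \frac{1}{\varepsilon}\left(\frac{n}{\ell(n)} - 1\right) =: N_n. \tag{11}$$

We first obtain a lower bound. From the estimate above,

$$\mathbb{E} e^{\lambda L(n)} \ge \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{\ell(n)} > 1 + \varepsilon\right) \ge \sum_{k=1}^{N_n} \mathbb{E}\left(e^{\lambda L(n)};\ 1 + k\varepsilon < \frac{L(n)}{\ell(n)} \le 1 + (k+1)\varepsilon\right) \ge \sum_{k=1}^{N_n} e^{\lambda \ell(n)(1+k\varepsilon)}\, \mathbb{P}\left(1 + k\varepsilon < \frac{L(n)}{\ell(n)} \le 1 + (k+1)\varepsilon\right).$$

Since $\lambda = \ln(1/p)$ and $\ell(n) = (\ln n)/\ln(1/p)$, we have $\exp(\lambda \ell(n)) = \exp(\ln n) = n$. Hence

$$\mathbb{E} e^{\lambda L(n)} \ge n \sum_{k=1}^{N_n} n^{k\varepsilon} \left[\mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + k\varepsilon\right) - \mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + (k+1)\varepsilon\right)\right]$$

$$\ge n \sum_{k=1}^{N_n} n^{k\varepsilon} \left[c_2\, n^{-(1+k\varepsilon)} (n + 1 - (1 + k\varepsilon)\ell(n)) - c_1\, n^{-(1+(k+1)\varepsilon)} (n + 1 - (1 + (k+1)\varepsilon)\ell(n))\right]$$

$$= n \sum_{k=1}^{N_n} \left[c_2\, n^{-1} (n + 1 - (1 + k\varepsilon)\ell(n)) - c_1\, n^{-1-\varepsilon} (n + 1 - (1 + (k+1)\varepsilon)\ell(n))\right] =: n\, S(n).$$

Hence

$$\frac{\ln \mathbb{E} e^{\lambda L(n)}}{\ell(n)} \ge \frac{\ln n}{\ell(n)} + \frac{\ln S(n)}{\ell(n)} = \ln(1/p) + \ln(1/p)\, \frac{\ln S(n)}{\ln n}.$$

We now claim that the last ratio converges to 1. This follows by direct computation:

$$\ln S(n) \sim \ln\left[\frac{c_2\, n}{2\varepsilon \log_{1/p} n} - \frac{c_1\, n^{1-\varepsilon}}{2\varepsilon \log_{1/p} n}\right] \sim \ln\left[\frac{c_2\, n}{2\varepsilon \log_{1/p} n}\right] = \ln n + o(\ln n).$$

Hence we have proved a lower bound:

$$\liminf_{n\to\infty} \frac{\ln \mathbb{E} e^{\lambda L(n)}}{\ell(n)} \ge 2\ln(1/p).$$

To get an upper bound, we use the decomposition (9) as in the proof of Lemma 3, but with
$\lambda = \ln(1/p)$. The first term is estimated in precisely the same manner; see (10). Hence

$$\limsup_{n\to\infty} \frac{\ln A_+(n)}{\ell(n)} \le (1 + \varepsilon)\ln(1/p). \tag{12}$$

For the second term, we write

$$B_+(n) = \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{\ell(n)} - 1 > \varepsilon\right) = \sum_{k=1}^{N_n} \mathbb{E}\left(e^{\lambda L(n)};\ 1 + k\varepsilon < \frac{L(n)}{\ell(n)} \le 1 + (k+1)\varepsilon\right),$$

where $N_n$ is as in (11), giving

$$B_+(n) \le \sum_{k=1}^{N_n} e^{\lambda \ell(n)[1+(k+1)\varepsilon]}\, \mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + k\varepsilon\right) = n^{1+\varepsilon} \sum_{k=1}^{N_n} n^{k\varepsilon}\, \mathbb{P}\left(\frac{L(n)}{\ell(n)} > 1 + k\varepsilon\right) \le n^{1+\varepsilon} \sum_{k=1}^{N_n} c_1\, n^{-1} (n + 1 - (1 + k\varepsilon)\ell(n)),$$

from which

$$\frac{\ln B_+(n)}{\ell(n)} \le (1 + \varepsilon)\ln(1/p) + \ln(1/p)\, \frac{1}{\ln n} \ln \sum_{k=1}^{N_n} c_1\, n^{-1} (n + 1 - (1 + k\varepsilon)\ell(n)).$$

By direct computation,

$$\ln \sum_{k=1}^{N_n} c_1\, n^{-1} (n + 1 - (1 + k\varepsilon)\ell(n)) \sim \ln\left[\frac{c_1\, n}{2\varepsilon \log_{1/p} n}\right] = \ln n + o(\ln n).$$

Combining the last two displays and letting $\varepsilon \downarrow 0$ we obtain

$$\limsup_{n\to\infty} \frac{\ln B_+(n)}{\ell(n)} \le 2\ln(1/p). \tag{13}$$

From the decomposition (9), with the estimates (12) and (13), and since $\varepsilon > 0$ is arbitrary, we conclude that

$$\limsup_{n\to\infty} \frac{\ln \mathbb{E} e^{\lambda L(n)}}{\ell(n)} = \max\left\{\limsup_{n\to\infty} \frac{\ln A_+(n)}{\ell(n)},\ \limsup_{n\to\infty} \frac{\ln B_+(n)}{\ell(n)}\right\} \le \max\{\ln(1/p),\ 2\ln(1/p)\} = 2\ln(1/p).$$

□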
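In order to study the asymptotic behavior of $\mathbb{E}\exp\{\lambda L(n)\}$ when $\lambda > \ln(1/p)$, we use the
following result.

Lemma 5. For fixed $0 < x < 1$, it holds that

$$\lim_{n\to\infty} \frac{1}{n} \ln \mathbb{P}\left(\frac{L(n)}{n} \ge x\right) = -x \ln(1/p).$$

Proof. We apply the inequalities of Lemma 2 with $k = \lceil nx \rceil$ and obtain

$$1 - \left(1 - q p^{\lceil nx \rceil}\right)^{n - \lceil nx \rceil + 1} \le \mathbb{P}\left(\frac{L(n)}{n} \ge x\right) \le 1 - \left(1 - p^{\lceil nx \rceil}\right)^{n - \lceil nx \rceil + 1}.$$

Since $1 - (1 - a)^N \le N a$ for all $0 \le a \le 1$, and since $1 - (1 - a)^N \ge (N - 1)a$ for all sufficiently
small $a > 0$, we have that

$$(n - \lceil nx \rceil)\, q p^{\lceil nx \rceil} \le \mathbb{P}\left(\frac{L(n)}{n} \ge x\right) \le (n - \lceil nx \rceil + 1)\, p^{\lceil nx \rceil},$$

for all sufficiently large $n$. Taking logarithms, dividing by $n$, and sending $n$ to $\infty$ finishes the
proof. □

Lemma 6. It holds that

$$\lim_{n\to\infty} \frac{1}{n} \ln \mathbb{E}\exp\{\lambda L(n)\} = \lambda - \ln(1/p),$$

for $\lambda > \ln(1/p)$.

Proof. For the lower bound, fix $0 < x < 1$, write

$$\mathbb{E} e^{\lambda L(n)} \ge \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{n} \ge x\right) \ge e^{\lambda x n}\, \mathbb{P}\left(\frac{L(n)}{n} \ge x\right),$$

and use Lemma 5:

$$\liminf_{n\to\infty} \frac{1}{n} \ln \mathbb{E} e^{\lambda L(n)} \ge \lambda x - x \ln(1/p) \to \lambda - \ln(1/p), \quad \text{as } x \uparrow 1.$$

For the upper bound, pick $\varepsilon > 0$ and write

$$\mathbb{E} e^{\lambda L(n)} = \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{n} \le \varepsilon\right) + \mathbb{E}\left(e^{\lambda L(n)};\ \frac{L(n)}{n} > \varepsilon\right)$$

$$\le e^{\lambda \varepsilon n} + \sum_{k=1}^{\lceil 1/\varepsilon \rceil - 1} \mathbb{E}\left(e^{\lambda L(n)};\ k\varepsilon < \frac{L(n)}{n} \le (k+1)\varepsilon\right) \le e^{\lambda \varepsilon n} + \sum_{k=1}^{\lceil 1/\varepsilon \rceil - 1} e^{\lambda(k+1)\varepsilon n}\, \mathbb{P}\left(\frac{L(n)}{n} \ge k\varepsilon\right).$$

Hence (with $a \vee b := \max(a, b)$)

$$\limsup_{n\to\infty} \frac{1}{n} \ln \mathbb{E} e^{\lambda L(n)} \le (\lambda\varepsilon) \vee \max_{1 \le k \le \lceil 1/\varepsilon \rceil - 1} \{\lambda(k+1)\varepsilon - k\varepsilon \ln(1/p)\} \le \lambda\varepsilon + \lambda - \ln(1/p) \to \lambda - \ln(1/p), \quad \text{as } \varepsilon \to 0,$$

where we used Lemma 5 again and the assumption that $\lambda - \ln(1/p) > 0$. □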
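The squeeze in the proof of Lemma 5 is easy to examine numerically. The sketch below is ours and purely illustrative: it evaluates $\frac{1}{n}\ln$ of the two bounds $(n - \lceil nx \rceil)\, q p^{\lceil nx \rceil}$ and $(n - \lceil nx \rceil + 1)\, p^{\lceil nx \rceil}$ in logarithmic form (the powers of $p$ underflow otherwise) and compares both with $-x\ln(1/p)$. The function name and the chosen values of $n$, $x$, $p$ are assumptions.

```python
import math


def lemma5_sandwich(n, x, p):
    """(1/n) * ln of the lower and upper bounds on P(L(n) >= n x) from Lemma 5's proof."""
    q = 1.0 - p
    k = math.ceil(n * x)
    # Work with logarithms throughout: p**k underflows for large k.
    log_lower = math.log(n - k) + math.log(q) + k * math.log(p)
    log_upper = math.log(n - k + 1) + k * math.log(p)
    return log_lower / n, log_upper / n


if __name__ == "__main__":
    x, p = 0.3, 0.5
    target = -x * math.log(1.0 / p)
    for n in (10**2, 10**4, 10**6):
        lower, upper = lemma5_sandwich(n, x, p)
        print(n, lower - target, upper - target)
```

Both differences shrink at rate $O(\ln n / n)$, which is the content of "taking logarithms, dividing by $n$, and sending $n$ to $\infty$".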
Lemma 7 (Theorem 1.1 in [9]). For each $x > 0$, we have

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} > 1 + x\right) = -x \ln(1/p).$$

For every $0 < x < 1$, we have

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln\left(-\ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} < 1 - x\right)\right) = x \ln(1/p).$$

Note that this lemma can be derived directly from Lemma 2, but what has actually been proved
in [9] is precise asymptotics without the logarithm.


3 Large deviations principles 


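We study the large deviations principles announced in Corollaries 1 and 2. Consider the logarithmic
moment generating function of $L(n)/\log_{1/p} n$, defined by

$$\Lambda_n(\lambda) = \ln \mathbb{E}\exp\{\lambda L(n)/\log_{1/p} n\}, \quad \lambda \in \mathbb{R}.$$

The proof of Corollary 1 is based on the cumulant, namely,

$$\Lambda(\lambda) := \lim_{n\to\infty} \frac{1}{\log_{1/p} n}\, \Lambda_n(\lambda \log_{1/p} n). \tag{14}$$

That this limit exists is a direct consequence of Theorem 1:

Proposition 1. The limit in (14) exists and is given by

$$\Lambda(\lambda) = \begin{cases} +\infty, & \lambda > \ln(1/p), \\ 2\lambda, & \lambda = \ln(1/p), \\ \lambda, & \lambda < \ln(1/p). \end{cases}$$

The Fenchel-Legendre transform of $\Lambda$ is the function $x \mapsto \sup_{\lambda \in \mathbb{R}} [\lambda x - \Lambda(\lambda)]$ which (as an easy
calculation shows) is given by the function $\Lambda^*$ defined in (4):

$$\sup_{\lambda \in \mathbb{R}} [\lambda x - \Lambda(\lambda)] = \Lambda^*(x) = \begin{cases} +\infty, & x < 1, \\ (x - 1)\ln(1/p), & x \ge 1. \end{cases}$$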

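Proof of Corollary 1. To prove the upper bound (6) we apply the Gärtner-Ellis theorem (cf. Section
2.3 in [3]). For the lower bound (5), we must give a separate argument. It suffices to prove that
for a fixed point $y > 1$,

$$\lim_{\delta \to 0} \liminf_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} \in B_{y,\delta}\right) \ge -(y - 1)\ln(1/p), \tag{15}$$

where $B_{y,\delta}$ is the open ball centered at $y$ with radius $\delta$. To achieve (15), we write

$$\mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} \in B_{y,\delta}\right) = \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} > y - \delta\right) - \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} \ge y + \delta\right).$$

In order to analyze the logarithm, we apply an inequality in the form $\ln(a - b) \ge \ln(a) - \frac{b}{a - b}$ for
$a > b > 0$. Therefore,

$$\lim_{\delta \to 0} \liminf_{n\to\infty} \frac{1}{\ell(n)} \ln \mathbb{P}\left(\frac{L(n)}{\ell(n)} \in B_{y,\delta}\right) \ge \lim_{\delta \to 0} \liminf_{n\to\infty} \frac{1}{\ell(n)} \left[\ln \mathbb{P}\left(\frac{L(n)}{\ell(n)} > y - \delta\right) - \frac{\mathbb{P}\left(\frac{L(n)}{\ell(n)} \ge y + \delta\right)}{\mathbb{P}\left(\frac{L(n)}{\ell(n)} > y - \delta\right) - \mathbb{P}\left(\frac{L(n)}{\ell(n)} \ge y + \delta\right)}\right]. \tag{16}$$

We can apply Lemma 7 to handle the first limit as follows:

$$\lim_{\delta \to 0} \lim_{n\to\infty} \frac{1}{\ell(n)} \ln \mathbb{P}\left(\frac{L(n)}{\ell(n)} > y - \delta\right) = \lim_{\delta \to 0} -(y - 1 - \delta)\ln(1/p) = -(y - 1)\ln(1/p). \tag{17}$$

For the last ratio term in (16), it follows from applying Lemma 7 twice that

$$\mathbb{P}\left(\frac{L(n)}{\ell(n)} \ge y + \delta\right) \le \exp\{[-(y - 1 + \delta)\ln(1/p) + \varepsilon_1]\, \ell(n)\}$$

and

$$\mathbb{P}\left(\frac{L(n)}{\ell(n)} > y - \delta\right) \ge \exp\{[-(y - 1 - \delta)\ln(1/p) - \varepsilon_2]\, \ell(n)\}$$

for sufficiently small $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$. Thus, assuming $2\delta\ln(1/p) - \varepsilon_1 - \varepsilon_2 > 0$,

$$\frac{\mathbb{P}\left(\frac{L(n)}{\ell(n)} \ge y + \delta\right)}{\mathbb{P}\left(\frac{L(n)}{\ell(n)} > y - \delta\right) - \mathbb{P}\left(\frac{L(n)}{\ell(n)} \ge y + \delta\right)} = \frac{1}{\mathbb{P}\left(\frac{L(n)}{\ell(n)} > y - \delta\right) \Big/ \mathbb{P}\left(\frac{L(n)}{\ell(n)} \ge y + \delta\right) - 1} \le \frac{1}{e^{(2\delta\ln(1/p) - \varepsilon_1 - \varepsilon_2)\ell(n)} - 1} \to 0, \quad \text{as } n \to \infty. \tag{18}$$

Then (15) follows by substituting (17) and (18) back into (16). □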

We now pass to the second large deviations principle. Consider the logarithmic moment generating
function of $L(n)/n$:

$$\Lambda_n(\lambda) = \ln \mathbb{E}\exp\{\lambda L(n)/n\}, \quad \lambda \in \mathbb{R},$$

and define its cumulant by

$$\Lambda(\lambda) := \lim_{n\to\infty} \frac{1}{n}\, \Lambda_n(\lambda n). \tag{19}$$

It is again Theorem 1 that is responsible for the existence of the cumulant:

Proposition 2. The limit in (19) exists and is given by the formula

$$\Lambda(\lambda) = \begin{cases} \lambda - \ln(1/p), & \lambda > \ln(1/p), \\ 0, & \lambda \le \ln(1/p). \end{cases}$$

An easy calculation shows that

$$\sup_{\lambda \in \mathbb{R}} [\lambda x - \Lambda(\lambda)] = \Lambda^*(x) = \begin{cases} +\infty, & x < 0, \\ x\ln(1/p), & 0 \le x \le 1, \\ +\infty, & x > 1, \end{cases}$$

which is the function announced in (7). The proof of Corollary 2 now proceeds along the same
lines as that of Corollary 1 and is therefore omitted.


4 Exponential functionals 


The proof of Corollary 3 is straightforward and follows from Varadhan’s integral lemma (cf. [3, 
Section 4.3]). 

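We next verify that the function $f(x) = c \cdot x^{\alpha}$, $0 < \alpha < 1$, satisfies the condition (A.1) in
Corollary 3. Without loss of generality, we assume $c > 0$ and write the threshold in (A.1) as $cm$
(which is immaterial since $m \to \infty$). We obtain

$$\frac{1}{\ell(n)} \ln \mathbb{E}\left[\exp\left(\ell(n) f\left(\frac{L(n)}{\ell(n)}\right)\right);\ f\left(\frac{L(n)}{\ell(n)}\right) \ge cm\right] = \frac{1}{\ell(n)} \ln \mathbb{E}\left[\exp\left(c\,\ell(n)\left(\frac{L(n)}{\ell(n)}\right)^{\alpha}\right);\ \left(\frac{L(n)}{\ell(n)}\right)^{\alpha} \ge m\right]$$

$$= \frac{1}{\ell(n)} \ln \sum_{k=0}^{\infty} \mathbb{E}\left[\exp\left(c\,\ell(n)\left(\frac{L(n)}{\ell(n)}\right)^{\alpha}\right);\ m + k \le \left(\frac{L(n)}{\ell(n)}\right)^{\alpha} < m + (k + 1)\right]$$

$$\le \frac{1}{\ell(n)} \ln \sum_{k=0}^{\infty} e^{c\,\ell(n)(m + k + 1)}\, \mathbb{P}\left(\left(\frac{L(n)}{\ell(n)}\right)^{\alpha} \ge m + k\right).$$

We now apply Lemma 2, in the form $\mathbb{P}(L(n) \ge t) \le n p^{t}$ for $t > 0$, with $t = (m + k)^{1/\alpha}\ell(n)$ and obtain

$$\mathbb{P}\left(\left(\frac{L(n)}{\ell(n)}\right)^{\alpha} \ge m + k\right) = \mathbb{P}\left(L(n) \ge \ell(n)(m + k)^{1/\alpha}\right) \le n\, p^{\ell(n)(m + k)^{1/\alpha}} = n^{1 - (m + k)^{1/\alpha}}.$$

Combining the previous two estimates, and using $(m + k)^{1/\alpha} \ge m^{1/\alpha} + k^{1/\alpha}$ (valid since $1/\alpha > 1$), gives

$$\frac{1}{\ell(n)} \ln \mathbb{E}\left[\exp\left(\ell(n) f\left(\frac{L(n)}{\ell(n)}\right)\right);\ f\left(\frac{L(n)}{\ell(n)}\right) \ge cm\right] \le c(m + 1) + \frac{1}{\ell(n)} \ln\left(\sum_{k=0}^{\infty} e^{c\,\ell(n) k}\, n^{1 - (m + k)^{1/\alpha}}\right)$$

$$\le c(m + 1) + \frac{1}{\ell(n)} \ln\left(\sum_{k=0}^{\infty} e^{c\,\ell(n) k}\, n^{1 - m^{1/\alpha} - k^{1/\alpha}}\right) = c(m + 1) - (m^{1/\alpha} - 1)\ln(1/p) + \frac{1}{\ell(n)} \ln\left(\sum_{k=0}^{\infty} n^{ck/\ln(1/p) - k^{1/\alpha}}\right).$$

Since $\alpha < 1$, the exponent $ck/\ln(1/p) - k^{1/\alpha}$ is bounded above in $k$ and tends to $-\infty$ as $k \to \infty$, so the
last term remains bounded as $n \to \infty$, uniformly in $m$. Because $c(m + 1) - (m^{1/\alpha} - 1)\ln(1/p) \to -\infty$
as $m \to \infty$ (again since $\alpha < 1$), (A.1) follows by taking $m \to \infty$.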

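With $f(x) = t x^{\alpha}$, $t > 0$, $0 < \alpha < 1$, we have

$$\max_{x \in \mathbb{R}} \{f(x) - \Lambda^*(x)\} = \max_{x \ge 1} \{t x^{\alpha} - \lambda_p (x - 1)\},$$

where $\lambda_p := \ln(1/p)$, for brevity. There are two cases:

Case 1: $t > \ln(1/p)/\alpha$. Then the maximum above is achieved at $x^* = (t\alpha/\lambda_p)^{1/(1-\alpha)}$ and equals

$$t^{\frac{1}{1-\alpha}}\, \lambda_p^{-\frac{\alpha}{1-\alpha}}\, C_\alpha + \lambda_p,$$

where $C_\alpha$ is the positive quantity

$$C_\alpha = \alpha^{\frac{\alpha}{1-\alpha}} - \alpha^{\frac{1}{1-\alpha}}.$$

Since $\ell(n) f(L(n)/\ell(n)) = t\, \ell(n)^{1-\alpha} L(n)^{\alpha} = t \lambda_p^{\alpha-1} (\ln n)^{1-\alpha} L(n)^{\alpha}$, Corollary 3 gives

$$\ln \mathbb{E}\, e^{t \lambda_p^{\alpha-1} (\ln n)^{1-\alpha} L(n)^{\alpha}} \sim \frac{\ln n}{\lambda_p} \left[t^{\frac{1}{1-\alpha}}\, \lambda_p^{-\frac{\alpha}{1-\alpha}}\, C_\alpha + \lambda_p\right] = (\ln n) \left[t^{\frac{1}{1-\alpha}}\, \lambda_p^{-\frac{1}{1-\alpha}}\, C_\alpha + 1\right].$$

Case 2: $t \le \ln(1/p)/\alpha$. Then the maximum is achieved at $x^* = 1$ and equals $t$. Hence

$$\ln \mathbb{E}\, e^{t \lambda_p^{\alpha-1} (\ln n)^{1-\alpha} L(n)^{\alpha}} \sim \frac{t}{\lambda_p} \ln n.$$

The expressions become neater upon a change of variables and are summarized thus:

Corollary 4. For all $t > 0$, for all $0 < \alpha < 1$, as $n \to \infty$,

$$\ln \mathbb{E}\, e^{t (\ln n)^{1-\alpha} L(n)^{\alpha}} \sim \begin{cases} \dfrac{t}{\ln^{\alpha}(1/p)}\, \ln n, & \text{if } t \le \dfrac{\ln^{\alpha}(1/p)}{\alpha}, \\[2ex] \left[\left(\dfrac{t}{\ln^{\alpha}(1/p)}\right)^{\frac{1}{1-\alpha}} \left(\alpha^{\frac{\alpha}{1-\alpha}} - \alpha^{\frac{1}{1-\alpha}}\right) + 1\right] \ln n, & \text{otherwise.} \end{cases}$$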
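The closed-form maximum in Cases 1 and 2 can be cross-checked against a brute-force search. The sketch below is an illustration we add, with hypothetical function names; the grid search assumes that the maximizer $x^* = (t\alpha/\lambda_p)^{1/(1-\alpha)}$ lies in $[1, x_{\max}]$, which holds for the parameter values used in the demo but not in general.

```python
import math


def closed_form_max(t, alpha, p):
    """Closed-form value of max_{x >= 1} [t x^alpha - lam_p (x - 1)], lam_p = ln(1/p)."""
    lam_p = math.log(1.0 / p)
    if t <= lam_p / alpha:  # Case 2: the maximum sits at x* = 1
        return t
    # Case 1: interior maximizer x* = (t alpha / lam_p)^(1 / (1 - alpha))
    c_alpha = alpha ** (alpha / (1 - alpha)) - alpha ** (1 / (1 - alpha))
    return t ** (1 / (1 - alpha)) * lam_p ** (-alpha / (1 - alpha)) * c_alpha + lam_p


def grid_max(t, alpha, p, x_max=200.0, steps=200_000):
    """Brute-force maximum of the same objective on a uniform grid over [1, x_max]."""
    lam_p = math.log(1.0 / p)
    best = -math.inf
    for i in range(steps + 1):
        x = 1.0 + (x_max - 1.0) * i / steps
        best = max(best, t * x**alpha - lam_p * (x - 1.0))
    return best


if __name__ == "__main__":
    for t, alpha, p in [(0.5, 0.5, 0.5), (2.0, 0.5, 0.5), (3.0, 0.3, 0.7)]:
        print(t, alpha, p, closed_form_max(t, alpha, p), grid_max(t, alpha, p))
```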
5 An application to inference


Let us consider a classical problem in confidence intervals. Let $\{X_k\}_{1 \le k \le n}$ be an i.i.d. random
sample from a Bernoulli population $X$ with $\mathbb{P}(X = 1) = p$ and $\mathbb{P}(X = 0) = 1 - p$, $0 < p < 1$. Our
aim in this section is to construct a $100(1 - \alpha)\%$ confidence interval for $p$ with a given significance
level $\alpha$, when $p$ is close to 1 (or 0) and $n$ is not very large.

The normal approximation to the binomial random variable $K := \sum_{i=1}^{n} X_i$ does not work well
when $p$ is close to 1 (or 0). Nevertheless, there are several alternatives in this case: Wilson’s score
interval [18], the Clopper-Pearson interval [2], and others (such as the Jeffreys interval, the Agresti-Coull
interval, etc.). In this section, we propose another confidence interval based on the longest head
run $L(n)$ with the help of Corollary 1. In the simulations reported in Tables 1 and 2 below, this type
of confidence interval is noticeably narrower than the alternatives.

To construct such confidence intervals, on the one hand it follows from Corollary 1 that, for each
$x > 0$,

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} > 1 + x\right) = -x \cdot \ln(1/p).$$

On the other hand, Lemma 7 states that, for every $0 < x < 1$,

$$\lim_{n\to\infty} \frac{1}{\log_{1/p} n} \ln\left(-\ln \mathbb{P}\left(\frac{L(n)}{\log_{1/p} n} < 1 - x\right)\right) = x \cdot \ln(1/p).$$

Combining these two asymptotics gives a $100(1 - \alpha)\%$ confidence interval for $p$ as follows:

$$I_p = \left(\exp\left\{-\frac{\ln(n) - \ln(\alpha/2)}{\hat{L}(n)}\right\},\ \exp\left\{-\frac{\ln(n) - \ln(-\ln(\alpha/2))}{\hat{L}(n)}\right\}\right), \tag{20}$$

where $\hat{L}(n)$ is a point estimate of $L(n)$. A reasonable point estimate of $L(n)$ is

$$\hat{L}(n) = L_{\mathrm{obs}}(n) - \left[\log_{1/\hat{p}}(1 - \hat{p}) + \log_{1/\hat{p}}(e^{\gamma}) - \frac{1}{2}\right],$$

with $L_{\mathrm{obs}}(n)$ being the observed longest head run in $n$ trials, and $\hat{p} := K/n$ being the sample
proportion. To see this, firstly we know that in the long run $L(n)/\log_{1/p} n \to 1$, therefore we
want an estimate that satisfies $\mathbb{E}\hat{L}(n) \approx \log_{1/p} n$. Secondly, it follows from the mean (8) that
$\mathbb{E}\hat{L}(n) = \log_{1/p} n + \log_{1/p}(1 - p) + \log_{1/p}(e^{\gamma}) - [\log_{1/\hat{p}}(1 - \hat{p}) + \log_{1/\hat{p}}(e^{\gamma})] + \varepsilon(n)$, which is quite
close to $\log_{1/p} n$. This explains why (20) is an appropriate confidence interval for $p$.
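The interval (20) together with the point estimate $\hat{L}(n)$ is straightforward to compute. The function below is a minimal sketch of ours, not code from the paper; its name and argument names are hypothetical. In the usage example, the observed run length `l_obs = 51` is an assumption that we back-calculated so as to reproduce the first LR row of Table 1 ($\hat{p} = 0.9650$, $n = 200$); the paper does not report the observed run lengths.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma in (8)


def longest_run_interval(l_obs, n, k, alpha=0.05):
    """Interval (20) for p, from the observed longest head run l_obs in n tosses with k heads."""
    p_hat = k / n                       # sample proportion; requires 0 < p_hat < 1
    log_base = math.log(1.0 / p_hat)    # ln(1/p_hat), converts ln to log base 1/p_hat
    # Plug-in bias terms from (8): log(1 - p_hat) + log(e^gamma) - 1/2, in base 1/p_hat.
    correction = (math.log(1.0 - p_hat) + EULER_GAMMA) / log_base - 0.5
    l_hat = l_obs - correction
    lower = math.exp(-(math.log(n) - math.log(alpha / 2.0)) / l_hat)
    upper = math.exp(-(math.log(n) - math.log(-math.log(alpha / 2.0))) / l_hat)
    return lower, upper


if __name__ == "__main__":
    # l_obs = 51 is an assumed value (see the lead-in), with k = 193 heads out of n = 200.
    print(longest_run_interval(51, 200, 193))
```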

Below we report simulations for the confidence interval $I_p$ in (20) when $p$ is close to 1 (the
case where $p$ is close to 0 can be handled similarly), and we compare it with Wilson score
intervals and Clopper-Pearson intervals. In the simulations (see Table 1), the
confidence interval (20) is consistently narrower than the others when $p$ is close to 1 and $n$ is not very
large. In Table 2, for larger $p$ and $n$, we also apply the normal approximation to the binomial random
variable. In this case the lower bound of the normal approximation interval is tighter
than those of the Wilson score and Clopper-Pearson intervals, but the upper bound is not.
In every case shown, the confidence interval (20) is the narrowest of those compared.
Table 1. Wilson score interval (WS), Clopper-Pearson interval (CP), longest run interval (LR)

p = 0.9500, n = 200, α = 0.05

  p̂         WS                  CP                  LR
  0.9650    (0.9295, 0.9829)    (0.9292, 0.9858)    (0.9329, 0.9696)
  0.9450    (0.9042, 0.9690)    (0.9037, 0.9722)    (0.9145, 0.9611)
  0.9600    (0.9231, 0.9796)    (0.9227, 0.9826)    (0.9243, 0.9656)
  0.9500    (0.9104, 0.9726)    (0.9100, 0.9758)    (0.9325, 0.9694)
  0.9700    (0.9361, 0.9862)    (0.9358, 0.9889)    (0.9484, 0.9767)

p = 0.98, n = 200, α = 0.05

  p̂         WS                  CP                  LR
  0.9800    (0.9497, 0.9922)    (0.9496, 0.9945)    (0.9657, 0.9846)
  0.9850    (0.9568, 0.9949)    (0.9568, 0.9969)    (0.9751, 0.9889)
  0.9700    (0.9361, 0.9862)    (0.9358, 0.9889)    (0.9578, 0.9810)
  0.9800    (0.9497, 0.9922)    (0.9496, 0.9945)    (0.9703, 0.9867)
  0.9750    (0.9428, 0.9893)    (0.9426, 0.9918)    (0.9606, 0.9821)


Table 2. Wilson score interval (WS), Clopper-Pearson interval (CP), longest run interval (LR),
normal approximation (N)

p = 0.9950, n = 1000, α = 0.05

  p̂         N                   WS                  CP                  LR
  0.9950    (0.9906, 0.9994)    (0.9883, 0.9979)    (0.9884, 0.9984)    (0.9915, 0.9955)
  0.9940    (0.9892, 0.9988)    (0.9870, 0.9972)    (0.9870, 0.9978)    (0.9909, 0.9952)
  0.9950    (0.9906, 0.9994)    (0.9883, 0.9979)    (0.9884, 0.9984)    (0.9919, 0.9957)
  0.9960    (0.9921, 0.9999)    (0.9898, 0.9984)    (0.9898, 0.9989)    (0.9941, 0.9969)
  0.9960    (0.9921, 0.9999)    (0.9898, 0.9984)    (0.9898, 0.9989)    (0.9938, 0.9967)



6 Open problems 

A problem for future research would be the study of a large deviation principle for the
random-dimensional random vector $R(n) = (R_1(n), R_2(n), \ldots, R_{L(n)}(n))$ of counts of successive runs of
all lengths. That is, let $R_\ell(n)$ be the number of head runs of length $\ell$ up to the $n$-th coin toss.
Distributional relations for $R(n)$ were studied in [8].

It would further be interesting to obtain large deviation principles for longest runs in a
Markov chain. In other words, assume that $(X_n)$ is a Markov chain with finite (or countable) state
space $S$ and let $L(x, n)$ be the longest sojourn time at a state $x \in S$ before time $n$. Although there
are Stein-Chen type estimates [10, 19] for the distribution of such quantities, the errors in these
estimates are too big for the study of a large deviation principle. We would like to obtain an LDP
for $L(x, n)$ or for the vector $(L(x, n),\ x \in S)$ which would, by the contraction principle, give us an LDP
for $L(n) := \sup_{x \in S} L(x, n)$.